In 2009, each student in a random sample of 6 BSc Statistics students at xyz University was asked about the salary package they were offered after graduation. In 2015, the same question was put to another sample of 6 students graduating from the same course. Test whether the mean salary offered after graduation has changed over these 6 years, at the 5% level of significance (95% confidence level).


batch2009 : 567, 759, 1029, 400, 998, 936

batch2015 : 820, 960, 700, 545, 769, 1001


Calculations by hand
knitr::include_graphics(paste0(path,"Example1ManualPage1.jpg"))

Execution in R
## Load the library required to integrate R and Python
library(reticulate)

## Load the python libraries
from scipy.stats import ttest_1samp
from scipy.stats import ttest_ind
 
## ttest_1samp: used for carrying out one-sample t-tests
## ttest_ind: used for carrying out two-sample independent t-tests

## Generate a vector of values
batch2009 <- c(567, 759, 1029, 400, 998, 936)
batch2015 <- c(820, 960, 700, 545, 769, 1001)

## Test for equality of variances
## p-value = 0.3878, so we fail to reject the null hypothesis of equal variances
## and proceed under the equal-variance assumption
var.test(batch2009, batch2015)

    F test to compare two variances

data:  batch2009 and batch2015
F = 2.276, num df = 5, denom df = 5, p-value = 0.3878
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
  0.3184771 16.2648725
sample estimates:
ratio of variances 
          2.275959 

## Carry out the two-sample t-test
## p-value = 0.8901 > 0.05, so we fail to reject the null hypothesis of equal means
t.test(batch2009, batch2015, alternative = "two.sided", var.equal = TRUE)

    Two Sample t-test

data:  batch2009 and batch2015
t = -0.14168, df = 10, p-value = 0.8901
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -295.4970  260.1637
sample estimates:
mean of x mean of y 
 781.5000  799.1667 

Execution in Python



## Generate a vector of values
batch2009 = [567, 759, 1029, 400, 998, 936]
batch2015 = [820, 960, 700, 545, 769, 1001]

## Carry out the two sample t-test 
ttest_ind(batch2009, batch2015, equal_var = True)
Ttest_indResult(statistic=-0.14168282046447547, pvalue=0.8901442534195294)
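
The statistic returned by ttest_ind can also be rebuilt step by step, which may help when checking the hand calculation. The following is a minimal sketch using only the Python standard library; the helper name pooled_t and the constant T_CRIT are illustrative assumptions and not part of scipy, and 2.228 is the usual table value for a two-sided 5% test with 10 degrees of freedom.

```python
import math
import statistics

batch2009 = [567, 759, 1029, 400, 998, 936]
batch2015 = [820, 960, 700, 545, 769, 1001]


def pooled_t(x, y):
    """Return (t, df) for the equal-variance two-sample t-test (hypothetical helper)."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom
    sp2 = ((n1 - 1) * statistics.variance(x) + (n2 - 1) * statistics.variance(y)) / df
    # Standard error of the difference between the two sample means
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (statistics.mean(x) - statistics.mean(y)) / se
    return t, df


t_stat, df = pooled_t(batch2009, batch2015)

# Assumed table value: two-sided 5% critical value of the t distribution, df = 10
T_CRIT = 2.228

print(f"t = {t_stat:.5f}, df = {df}")
print("reject H0" if abs(t_stat) > T_CRIT else "fail to reject H0")
```

If the sketch is right, the printed t and df should agree with the R and scipy results shown earlier, and the comparison against the critical value should lead to the same decision as the p-value.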

Example 2

Dr. Smith is teaching two sections of statistics, with 15 and 19 students respectively. The grades on an exam are as follows.

Section 1: 100,95,90,90,90,90,85,83,80,79,71,71,70,66,48
Section 2: 100,100,100,100,98,98,98,93,93,90,86,83,81,79,79,76,61,48,41

One of the classes asks whether they did significantly better or worse on the exam than the other class. Using an alpha level of 0.10, what should Dr. Smith tell them?

Calculations by hand


knitr::include_graphics(paste0(path,"Example2ManualPage1.jpg"))

knitr::include_graphics(paste0(path,"Example2ManualPage2.jpg"))

Execution in R


## Generate a vector of values

Section1 <- c(100,95,90,90,90,90,85,83,80,79,71,71,70,66,48)
Section2 <- c(100,100,100,100,98,98,98,93,93,90,86,83,81,79,79,76,61,48,41)

## Test for equality of variances
## p-value = 0.308, so we fail to reject the null hypothesis of equal variances
## and proceed under the equal-variance assumption
var.test(Section1, Section2)

    F test to compare two variances

data:  Section1 and Section2
F = 0.58175, num df = 14, denom df = 18, p-value = 0.308
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.2157467 1.6751285
sample estimates:
ratio of variances 
         0.5817462 

## Carry out the two-sample t-test
## p-value = 0.4856 > 0.10, so we fail to reject the null hypothesis of equal means
t.test(Section1, Section2, alternative = "two.sided", var.equal = TRUE)

    Two Sample t-test

data:  Section1 and Section2
t = -0.70546, df = 32, p-value = 0.4856
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -15.113092   7.337654
sample estimates:
mean of x mean of y 
 80.53333  84.42105 

Execution in Python


## Generate a vector of values
Section1 = [100,95,90,90,90,90,85,83,80,79,71,71,70,66,48]
Section2 = [100,100,100,100,98,98,98,93,93,90,86,83,81,79,79,76,61,48,41]

## Carry out the two sample t-test 
ttest_ind(Section1, Section2, equal_var = True)
Ttest_indResult(statistic=-0.7054576233706772, pvalue=0.48562953031548084)
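
The R output above reports a 95% confidence interval, while the question is posed at an alpha level of 0.10. As a rough sketch, the variance ratio reported by var.test and a 90% interval for the difference in means can be computed with the Python standard library alone. The variable names and the constant T_CRIT are assumptions made for illustration; 1.694 is the usual table value for a two-sided 10% test with 32 degrees of freedom.

```python
import math
import statistics

section1 = [100, 95, 90, 90, 90, 90, 85, 83, 80, 79, 71, 71, 70, 66, 48]
section2 = [100, 100, 100, 100, 98, 98, 98, 93, 93, 90,
            86, 83, 81, 79, 79, 76, 61, 48, 41]

n1, n2 = len(section1), len(section2)
v1, v2 = statistics.variance(section1), statistics.variance(section2)

# Ratio of sample variances, the F statistic that var.test reports
f_ratio = v1 / v2

# Pooled standard error of the difference in means (equal-variance assumption)
df = n1 + n2 - 2
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
diff = statistics.mean(section1) - statistics.mean(section2)

# Assumed table value: two-sided 10% critical value of the t distribution, df = 32
T_CRIT = 1.694
ci_low, ci_high = diff - T_CRIT * se, diff + T_CRIT * se

print(f"F = {f_ratio:.5f}, df = {df}")
print(f"90% CI for the difference in means: ({ci_low:.2f}, {ci_high:.2f})")
# An interval that contains 0 means the difference is not significant at alpha = 0.10
print("fail to reject H0" if ci_low < 0 < ci_high else "reject H0")
```

An interval that contains zero corresponds to failing to reject the null hypothesis at the 0.10 level, so under these assumptions Dr. Smith would tell the classes that neither section did significantly better or worse than the other.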

Additional Resources: